22 research outputs found

Faster Approximate String Matching for Short Patterns

Author: A. Andersson
A.H. Wright
D. Gusfield
D. Harel
D.E. Knuth
E. Ukkonen
E.W. Myers
F.T. Leighton
G. Myers
G. Navarro
G.M. Landau
H. Hyyrö
K.E. Batcher
M. Farach-Colton
M.A. Bender
P. Bille
P. Sellers
Philip Bille
R. Baeza-Yates
R. Cole
R.A. Baeza-Yates
R.A. Wagner
S. Albers
S. Alstrup
S. Wu
S.C. Sahinalp
T. Hagerup
T.H. Cormen
V.L. Arlazarov
W. Masek
Z. Galil
Publication date: 17/03/2011

We study the classical approximate string matching problem, that is, given strings $P$ and $Q$ and an error threshold $k$, find all ending positions of substrings of $Q$ whose edit distance to $P$ is at most $k$. Let $P$ and $Q$ have lengths $m$ and $n$, respectively. On a standard unit-cost word RAM with word size $w \geq \log n$ we present an algorithm using time $O(nk \cdot \min(\frac{\log^2 m}{\log n},\frac{\log^2 m\log w}{w}) + n)$. When $P$ is short, namely, $m = 2^{o(\sqrt{\log n})}$ or $m = 2^{o(\sqrt{w/\log w})}$, this improves the previously best known time bounds for the problem. The result is achieved using a novel implementation of the Landau-Vishkin algorithm based on tabulation and word-level parallelism.

Comment: To appear in Theory of Computing Systems
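To make the problem statement above concrete, here is a minimal sketch of the classical column-by-column dynamic program for approximate string matching, which runs in O(mn) time. It is not the algorithm of the paper (which builds on Landau-Vishkin with tabulation and word-level parallelism); it only illustrates what "ending positions of substrings within edit distance k" means. The function name `approx_match_end_positions` and the 1-based position convention are assumptions made for this illustration.

```python
from typing import List


def approx_match_end_positions(pattern: str, text: str, k: int) -> List[int]:
    """Return the 1-based positions j such that some substring of `text`
    ending at j has edit distance at most k to `pattern`.

    Baseline O(m*n) dynamic program (illustrative only).
    """
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any substring
    # of the text ending at the current text position. Before reading any
    # text, matching pattern[:i] costs i insertions.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        # col[0] stays 0: a match may start at any text position for free.
        prev_diag = col[0]
        for i in range(1, m + 1):
            old = col[i]  # value from the previous text position
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra text character
                         col[i - 1] + 1)    # missing pattern character
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(approx_match_end_positions("abc", "xabcx", 0))
    print(approx_match_end_positions("abc", "xabcx", 1))
```

With k = 0 the function reports only exact occurrences; increasing k admits substrings that differ from the pattern by up to k insertions, deletions, or substitutions.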

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

Nested Counters in Bit-Parallel String Matching

Author: G. Das
G. Myers
G. Navarro
G.M. Landau
H. Hyyrö
J. Kuri
K. Fredriksson
M. Pǎtraşcu
M.L. Fredman
R.A. Baeza-Yates
S. Grabowski
W. Chang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Crossref

A method for automatically extracting infectious disease-related primers and probes from the literature

Author: A Loy
Alejandro Cuevas
BS Rice
D Betel
DA Benson
David Pérez-Rey
Diana de la Iglesia
EA Mothershed
F Li
F Pattyn
Fernando Martín-Sánchez
G De la Calle
Guillermo de la Calle
Guillermo López-Campos
H González-Díaz
H Hyyrö
HD VanGuilder
HP Lee
J Stajich
J Tamames
J Tarhio
JJ Rocchio
José Crespo
K Pabbaraju
L Hirschman
LL Cheng
LT Bravo
M Minsky
MB Miller
MC Enright
MG Campi
Miguel García-Remesal
National Center for Biotechnology Information
P Harmon
PC Woo
R McDonald
RM Ratcliff
SF Altschul
Victoria López-Alonso
Víctor Maojo
YC Huang
Publication venue: Springer Science and Business Media LLC

Crossref

Increased bit-parallelism for approximate and multiple string matching

Author: Allauzen C.
Gonzalo Navarro
Heikki Hyyrö
Hyyrö H.
Kimmo Fredriksson
Levenshtein V.
Muth R.
Publication venue: Association for Computing Machinery (ACM)

Crossref

An efficient algorithm for generating super condensed neighborhoods

Author: A.L. Cobbs
D. Gusfield
E. Ukkonen
G. Myers
G. Navarro
H. Hyyrö
R. Baeza-Yates
S. Wu
Publication venue: Springer
Publication date: 01/01/2005

Abstract. Indexing methods for the approximate string matching problem spend considerable effort generating condensed neighborhoods. Here, we point out that condensed neighborhoods are not a minimal representation of a pattern neighborhood. We show that we can restrict our attention to super condensed neighborhoods, which are minimal. We then present an algorithm for generating super condensed neighborhoods. The algorithm runs in O(m⌈m/w⌉s) time, where m is the pattern size, s is the size of the super condensed neighborhood, and w is the size of the processor word. Previous algorithms took O(m⌈m/w⌉c) time, where c is the size of the condensed neighborhood. We further improve this algorithm by using bit-parallelism and increased bit-parallelism techniques. Our experimental results show that the resulting algorithm is very fast.

CiteSeerX

Crossref

New Bit-Parallel Indel-Distance Algorithm

Author: A. Wright
E. Ukkonen
G. Myers
G. Navarro
H. Hyyrö
R. Baeza-Yates
S. Wu
V.I. Levenshtein
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005

Crossref

Dynamic Edit Distance Table under a General Weighted Cost Function

Author: G.M. Landau
H. Hyyrö
J.P. Schmidt
S.R. Kim
W.J. Masek
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Crossref

On finding k-cliques in k-partite graphs

Author: H. Hyyrö
M. Mirghorbani
P. Krokhmal
P. San Segundo
R.D. Luce
S. Grabowski
T. Grunert
Publication venue: Springer Science and Business Media LLC

Crossref

Fast subtrajectory similarity search in road networks under weighted edit distance constraints

Author: Hyyrö H.
Idé T.
Mokbel M. F.
Waury R.
Xia Y.
Publication venue: VLDB Endowment

Crossref

A Practical Index for Genome Searching

Author: E. Myers
E. Ukkonen
G. Myers
G. Navarro
H. Hyyrö
H.E. Williams
Publication venue: Springer
Publication date: 01/01/2003

Current search tools for computational biology trade efficiency for precision, losing many relevant matches. We push in the direction of obtaining maximum efficiency from an indexing scheme that does not lose any relevant match. We show that it is feasible to search the human genome efficiently on an average desktop computer.

CiteSeerX

Crossref